Electronic material for:

Modeling and Rendering Architecture from Photographs:
A Hybrid Geometry- and Image-Based Approach
http://www.cs.berkeley.edu/~debevec/Research/

Paul E. Debevec
debevec@cs.berkeley.edu
http://www.cs.berkeley.edu/~debevec/

Camillo J. Taylor
camillo@cs.berkeley.edu
http://HTTP.CS.Berkeley.EDU/~camillo/

Jitendra Malik
malik@cs.berkeley.edu
http://HTTP.CS.Berkeley.EDU/~malik/

Computer Vision Group
http://http.cs.berkeley.edu/projects/vision/vision_group.html

Computer Science Division
http://www.cs.berkeley.edu/

University of California at Berkeley
http://www.berkeley.edu/

==========
TIFF Images

Here are electronic originals of the figures used in our paper. The
numbering matches the paper, except for fig07b.tif, which did not appear
in the paper due to space limitations. More information, our latest
results, and an expanded version of the paper are available online at:

http://www.cs.berkeley.edu/~debevec/Research/

fig01.tif
    Schematic comparison of geometry-based and image-based
    modeling/rendering systems, and our hybrid approach.

fig02ab.tif
    Screenshots from the photogrammetric modeling system: the image
    viewer showing marked features, and the model viewer showing the
    recovered model. This model was recovered from a single photograph,
    which was made possible by embedding symmetry constraints into the
    model. The tower is the Campanile at the University of California
    at Berkeley.

fig02cd.tif
    (c) Reprojected model edges, showing the accuracy of the recovered
    model. Only edges belonging to front-facing faces are shown.
    (d) A novel view of the clock tower generated from three images and
    view-dependent texture-mapping. The virtual camera position is 250
    feet above the ground.

fig07.tif
    Three of twelve images used to reconstruct a high school building
    (University High School in Urbana, IL), with marked features shown
    in green. The original images used were 768 x 512 pixels.

fig07b.tif
    The edges of the recovered model, reprojected through the
    corresponding recovered camera positions and overlaid on the same
    three images shown in fig07.tif. The blue reprojected edges conform
    closely to the original photographs, which indicates that the
    building has been reconstructed accurately. Only edges belonging to
    front-facing faces are shown.

fig08.tif
    Three views of the recovered high school model, rendered as
    flat-shaded polygons. The twelve recovered camera positions are all
    visible in the bottom picture.

fig09.tif
    A novel view of the high school building (from about 25 feet above
    the ground) rendered with the view-dependent texture-mapping method.
    Some artifacts due to uneven exposure in the images can be seen
    toward the right of the image. Some trees were masked out of the
    original images to produce this rendering.

fig10abc.tif
    A reconstruction of Hoover Tower in Palo Alto, California. As in
    fig02, this reconstruction was made from a single photograph. The
    first image shows the original photograph, with approximately 50
    user-marked edges. The second image shows the recovered model. Since
    the top of the tower was not visible in the photograph, its height
    had to be guessed. The last image shows the result of projecting the
    first image onto the recovered model. The blue regions indicate
    areas that could not be seen in the original photograph.

fig11.tif
    The process of view-dependent texture mapping. The top two images
    show two individual images projected onto the building. The bottom
    left image shows how both projections can be composited using our
    view-dependent weighting function. The final image shows the result
    of compositing all twelve images using view-dependent
    texture-mapping.

fig13.tif
    The benefit of view-dependent texture mapping. (a) A detail view of
    the high school model. (b) A rendering of the model from the same
    position using view-dependent texture mapping.
    Note that although the model does not capture the slightly recessed
    windows, the windows appear properly recessed because the texture
    map is sampled primarily from a photograph that viewed the windows
    from approximately the same direction. (c) The same piece of the
    model viewed from a different angle, using the same texture map as
    in (b). Since the texture is not selected from an image that viewed
    the model from approximately the same angle, the recessed windows
    appear unnatural. (d) A more natural result obtained by using
    view-dependent texture mapping. Since the angle of view in (d) is
    different from that in (b), a different composition of original
    images is used to texture-map the model.

fig14a.tif, fig14b.tif, fig14c.tif
    Key, warped offset, and offset images used in the model-based stereo
    algorithm. The key and offset images are original pictures of the
    entrance to Peterhouse chapel at Cambridge University. The warped
    offset image was created by projecting the offset image onto a very
    basic model (two quadrilaterals) of the entrance, and then
    reprojecting it into the key camera position. As a result, the
    structure of the scene is easier to recover by comparing the key and
    warped offset images than by directly comparing the key and offset
    images.

fig14d.tif
    A disparity map computed by the model-based stereo algorithm. The
    brightness values are a function of the distance between the
    computed depth of the actual scene and the depth predicted by the
    approximate model. This disparity map can then be used to produce a
    depth map for the key image.

fig16a.tif, fig16b.tif, fig16c.tif
    Rendered views of the recovered chapel facade model. These are
    full-size images of frames 68, 0, and 290 of movie6.mov. A depth map
    for each of four key images was recovered using model-based stereo.
    For each rendering, all four images were warped to the desired
    viewpoint using image-based rendering techniques.
    Lastly, the four warped images were composited using view-dependent
    texture-mapping to produce the final rendering.

==========
QuickTime Movies

movie1.mov
    Four images projected onto the recovered high school model. A shadow
    buffer algorithm is used to compute which parts of the model are
    visible from the original camera positions.

movie2.mov
    All twelve images projected onto the high school model, composited
    with view-dependent texture-mapping. Some trees and signs can be
    seen incorrectly projected onto the surface of the building.

movie3.mov
    Same as movie2.mov, with obstructions (signs, trees) masked out of
    the original images.

movie4.mov
    Fly-around of the chapel facade rendered with traditional
    texture-mapping. The facade appears flat.

movie5.mov
    Fly-around of the chapel facade rendered with view-dependent
    texture-mapping of four images. No model-based stereo detail
    recovery has been performed. Since the model is such a rough
    approximation of the actual surface of the facade, view-dependent
    texture-mapping produces an undesirable amount of blurring.

movie6.mov
    Fly-around of the chapel facade with geometric detail recovered from
    model-based stereo and composited with view-dependent
    texture-mapping using the same four images. Since the original
    images are warped according to the scene's recovered structure,
    rather than the approximate structure of the model, the composited
    renderings are more realistic.
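
==========
Illustrative sketch: view-dependent compositing

Several of the figures and movies above composite multiple projected
photographs with a view-dependent weighting function. This file does not
give that function, so the Python sketch below is only an assumption of
one plausible form: each photograph's weight is inversely proportional
to the angle between the virtual viewing direction and that photograph's
viewing direction, and the weights are normalized to sum to one. The
names view_dependent_weights, angle_between, blend, and the eps
parameter are hypothetical and are not taken from our system.

```python
import math


def _normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length direction")
    return tuple(c / n for c in v)


def angle_between(a, b):
    """Angle in radians between two direction vectors."""
    a, b = _normalize(a), _normalize(b)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))


def view_dependent_weights(view_dir, camera_dirs, eps=1e-6):
    """Assumed weighting: inverse angle to each camera, normalized.

    eps (hypothetical) avoids division by zero when the virtual view
    coincides exactly with one of the original camera directions.
    """
    raw = [1.0 / (angle_between(view_dir, d) + eps) for d in camera_dirs]
    total = sum(raw)
    return [w / total for w in raw]


def blend(colors, weights):
    """Weighted average of per-photograph colors for one surface point."""
    channels = len(colors[0])
    return tuple(
        sum(w * c[i] for w, c in zip(weights, colors))
        for i in range(channels)
    )


if __name__ == "__main__":
    # Virtual view looks along +z; three photographs viewed the surface
    # from progressively more oblique directions.
    w = view_dependent_weights((0, 0, 1), [(0, 0, 1), (1, 0, 1), (1, 0, 0)])
    print(w)
    print(blend([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)], w))
```

With a weighting of this kind, the photograph taken from the direction
closest to the virtual view dominates the composite. A complete renderer
would also need a visibility test (such as the shadow buffer mentioned
for movie1.mov) so that photographs in which a surface point is occluded
receive zero weight; that step is omitted here.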